Modified conjugated gradient method for diagonalising large matrices 
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We present an iterative method to diagonalise large matrices. The basic idea is the same as the 
conjugated gradient (CG) method, i.e, minimizing the Rayleigh quotient via its gradient and avoid- 
ing reintroduce errors to the directions of previous gradients. Each iteration step is to find lowest 
eigenvector of the matrix in a subspace spanned by the current trial vector and the corresponding 
gradient of the Rayleigh quotient, as well as some previous trial vectors. The gradient, together 
with the previous trail vectors, play a similar role of the conjugated gradient of the original CG 
algorithm. Our numeric tests indicate that this method converges significantly faster than the orig- 
inal CG method. And the computational cost of one iteration step is about the same as the original 
CG method. It is suitably for first principle calculations. 

PACS numbers: 02.70.-c, 31.15.Ar, 31.15.-p, 95.75.Pq 
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I. INTRODUCTION 

In first principle calculations, such as band struc- 
ture calculations, atomic and molecular structure cal- 
culations, one of the basic tasks is to find several low- 
est eigenvalues and the corresponding eigenvectors iter- 
atively (and very often self-consistently) of the effective 
Hamiltonian The matrix dimension of the Hamilto- 
nian may range from tens of thousands to several mil- 
lions, and one may need up to several thousands of the 
lowest eigenvectors of the effective Hamiltonian. Diag- 
onalising of matrices in such a scale needs considerable 
CPU time and memories. It is one of the major numerical 
costs in the first principle calculations, and the efficiency 
of the algorithm is crucial for the performances of the 
whole program. There are many efforts to improve the 
algorithm [111111113. 

Among widely used algorithms, such as Lanczos [ll3|j 
Dividson j^j , relaxation method ^ , DIIS ( Direct In- 
version in the Iterative Subspace, which minimizes all 
matrix elements between all trial vectors) T^, and its 
later version RMM-DIIS H (RMM stands for resid- 
ual minimization, i.e., minimizing the norm of the resid- 
ual vector in iterative subspace), the conjugated gradient 
(CG) method 0,0 is a valuable tool to find a set of low- 
est eigenvectors of a large matrix. Briefly speaking, to 
obtain the lowest eigenvector of a matrix H for the gen- 
eral form eigenvalue problem 
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the CG method iteratively minimizes the Rayleigh quo- 
tient 
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where S is the overlap matrix, and |(/)„) is a refined trial 
vector at step n. Each iteration step is to search the mini- 



mization point in the direction of the conjugated gradient 
which is a combination of the current gradient and pre- 
vious conjugated gradient. One can obtain higher eigen- 
vectors in the same way, provided to keep the trial vector 
orthogonal to the lower eigenvectors. In practical calcu- 
lations, the CG method is stable and reasonably efficient 
in many cases, and it is easy to implement. The itera- 
tion procedure needs only to store the trial vector and its 
gradient, as well as one previous conjugated gradient. 

Conjugated gradient method is originally designed to 
minimize positive definite quadratic functions iteratively. 
In n-th step of iteration, The CG method is equivalent to 
find a minimum in a 7i-dimensional subspace spanned by 
the initial trial vector and the subsequent n—l gradients 
of the quadratic function. Due to special properties of 
a quadratic function, one needs only do the minimiza- 
tion in a two dimensional space spanned by current state 
and the conjugated gradient which is a combination of 
current gradient and last step's conjugated gradient. In 
principle, one needs at most N steps to obtain final so- 
lution in a A'^ dimensional space. Practical calculations 
usually needs more steps due to round off errors. The 
conjugated gradient method is virtually the most effec- 
tive method to minimize a quadratic function iteratively. 
And it is a formally established algorithm to solve the 
linear algebraic equation. 

For general functions, such as Rayleigh quotient, there 
are several ways to define the conjugated gradient, and 
the behaviors of conjugated gradient algorithm are un- 
clear. However, near an exact minimum point, any func- 
tion behaves like a quadratic function. If one starts with 
a good guess, one may find solution very quickly. This 
partially explains the successes of the conjugated gradi- 
ents method in diagonalising a large matrix. 



II. THE MODIFIED CONJUGATED 
ALGORITHM 
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Our method is based on the following two observations: 
Firstly, each iteration step of minimizing the Rayleigh 
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quotient by CG algorithm is equivalent to find lowest 
eigenvector in a two dimensional subspacc. The subspace 
at n-th step is spanned by the current state \(t>n), and the 
conjugated gradient \Fn)- Note that the conjugated gra- 
dient \Fn) is a combination of the gradient of n-th step's 
Rayleigh quotient, and the (n — l)-th step's conjugated 
gradient |i^„_i). One may expect a better result at n-th 
iterative step by finding lowest eigenvector in a three di- 
mensional subspace spanned by \(j)„), |G„), and I^F^-i), 
where |G„) is the gradient of the Rayleigh quotient at 
n-th step. 

Secondly, we note that, within the CG algorithm, \(f>n) 
is a combination of |(/)„_i) and |-F„_i). Thus the three 
dimensional subspace spanned by |G„), and |-Fn-i) 
is the same as the subspace spanned by | \Gn), and 
|(/)„_i). This means that one may obtain a better result 
{(f'n+i) at n-th step by replace the n-step iteration of CG 
algorithm with finding lowest eigenvector at the three 
dimensional subspace spanned by | (/>„), |G„), and |(?!)„_i). 
Of course, the result will further improved if one finds the 
lowest eigenvector in a larger subspace spanned by \4>n)-, 

\Gn), \<l>n-l), Ift'n-m+l)- 

The above observations indicate that one may improve 
the efficiency of the CG algorithm by replacing each iter- 
ation step of the CG algorithm with finding lowest eigen- 
vector in a small subspace spanned by the current vec- 
tor |0„) and the corresponding gradient as well as 
some previous vectors |(/>„_2), • • • ■ In our numeric 

tests, the effect is significant in many cases. Since diago- 
nalising a small matrix of several dimension is very cheap 
numerically, each step's numeric cost of the modified ver- 
sion is about the same as the original CG algorithm. 

Practical implementation of the modified conjugated 
gradient method is similar to that of original CG method. 
For finding one single lowest eigenvalue and its corre- 
sponding eigenvector, it goes through the following steps: 

1. Choose the dimension M of the iteration subspace, 
and the maximum iteration step N^ax- In our 
numerical test, it is enough to set the dimension 
M < 10. In many case, M = 2> works quite well. 
In this case, the 3 dimensional subspace is spanned 
by current trial vector | <?!)„), the corresponding gra- 
dient \Gn) and one previous trial vector |<?^n-i) ob- 
tained in the last step. 

2. Choose an initial normalized trial vector \4>q), 
((^ol'S'l^o) = 1; and calculate the expectation value 
(Rayleigh quotient) Eq = {(j}Q\H\(t)o) . 

3. For n = 0, 1, 2, • • • , Nmax, do the following itera- 
tion loop to refine the trial vector from |^o) to 
|</'2),---: 

(a) Calculate the gradient of the Rayleigh quo- 
tient 

|G„) = if |<^„) - £„|<^„). (3) 

Here the refined trial function |0„) is normal- 
ized at the end of each iteration. 



(b) In the m dimensional subspace spanned by 

m = |G„), 1V2) = iv-s) = 

IV'm) = \4>n-m+2), Calculate the matrix ele- 
ments of the matrix H, Tiij = {ipi\H\tpj), and 
the overlap matrix of the basis vector Sij = 
(ipilSltpj) . Here, the dimension m = n -|- 1 if 
n-h 1 < M, otherwise m = M, i.e., in the first 
M — 1 loops, the subspace has only n -|- 1 basis 
vector. 

(c) find the lowest eigenvalue e and eigenvector (p 
for the general form eigenvalue problem 

Hf = eS(p. (4) 

(d) From the above eigenvector ip, construct the 
refined trial vector |0„+i), 

m 

\(l)„+i} ^^ifili^i), (5) 

i=l 

and calculate the expectation value En+i = 

{(j)„+l\H\(l)n+l). 

(e) If \En+i — En\ is less than a required value or 
n > N^ax: stop the iteration loop, otherwise 
continue the iteration loop. 

Impose a maximum iteration step is necessary in many 
cases. For example, in self-consistent calculations, one 
needs to update the Hamiltonian after some steps of it- 
erations. The trial vector can be chosen, in principle, 
arbitrarily, provided it is not orthogonal with the lowest 
eigenvector. However, even if the initial trial vector does 
accidently orthogonal to the lowest eigenvector, due to 
the numeric round off errors in the iterations, one can 
always arrive the lowest eigenvector. 

Check the convergence is usually testing the difference 
between the trial vector and its refined version after an 
iteration. In our numeric tests, check the difference be- 
tween two consecutive trial vectors' Rayleigh quotients 
also works well. And it is numerically faster. 

For large matrices, calculation of the gradient is a main 
numeric task in each loop of iteration. It involves a mul- 
tiplication of matrix and vector. Other numeric costs are 
mainly the calculation of the matrix elements Tijj and 
Sij in the small subspace , as well as the combination of 
the gradient and previous trial vectors to form a refined 
trial vector. The numeric cost of diagonalising the small 
matrix T-L is almost nothing as compared with other op- 
erations. In each loop of iteration, the subspace changes 
two basis vectors, i.e., the current gradient |G„) replaces 
the previous one |G„_i), and the refined trial vector |0„) 
replace the old one \(j)n-m+2)- One needs only to calcu- 
late the matrices elements Hij and Sij related to the two 
vectors in each iteration loop. If the subspace is three 
dimensional, the numerical cost of one iteration loop is 
about the same as that of original CG method. 

After finding the lowest eigenvector, one can find the 
second lowest one in a similar way. One starts with a 
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trial vector orthogonal to the lowest eigenvector, and in 
following iterations, gradients of the Rayleigh quotient, 
as well as the updated trial vectors, must be kept orthog- 
onal to the lowest eigenvector. Similarly, after working 
out k lowest eigenvectors, the k + 1 eigenvector can be 
worked out by maintaining the orthogonality with k lower 
eigenvectors. 

In this strict sequential procedure, the accuracy of 
lower eigenvectors affect the higher ones. A remedy to 
this problem, according to Ref . z] , is re-diagonalising the 
matrix in the subspace spanned by the refined trial vec- 
tors, which is referred as subspace rotation in Q]. After 
this subspace rotation, one can use these resulted vec- 
tor as trial vectors for further iteration to improve the 
accuracy. In practical implementations, we only iterate 
every trial vector for some steps, then perform a subspace 
rotation. The convergence check is to test the eigenval- 
ues differences between two consecutive subspace rota- 
tion. This procedure improves the overall efRciency. In 
Ref. 0, there is a detailed discussion on the role of the 
subspace rotation. 



III. NUMERICAL RESULTS 

We test the efficiency of the above outlined algorithm 
by comparing its performance with other algorithms for 
various matrices. In all cases, the modified CG algorithm 
outperforms the original CG algorithm. We observe sig- 
nificant improvement to the convergence rate in many 
cases. 

As an illustration, we show in figure 1 a typical result 
for a banded matrix with bandwidth 2L. The matrix's 
diagonal element is an = 2\/i — a, and its off-diagonal 
elements within the band width is a constant = a. 
Due to its simple form and its relation with Hamiltonian 
describing the pairing effects, this matrix has been in- 
vestigated by some other authors, see, e.g. Here we 
choose the matrix's dimension N = 200000 with half- 
bandwidth L — 300, the parameter a is set to be 20. 
For finding first 8 lowest eigenvectors, the modified CG 
algorithm converges within 100 steps with an accuracy 
of machine's precision limit. It is more than three times 
faster than the original CG algorithm. As a comparison, 
we also show the result for the block Lanczos method , 
as well as the steepest decent method. In Figure 1, the 
convergence rate of one iteration step is defined as the 
relative error of the two consecutive Rayleigh quotients, 
{En - En-i)/[{En + £^„-i)/2], where En-i and E^ are 
two consecutive Rayleigh quotients. When every eigen- 
value reaches the required accuracy, we perform a sub- 
space rotation and repeat the iteration. Convergence is to 
test the corresponding relative error for every eigenvalue 
between two consecutive rotations. In our implementa- 
tion, the maximum iteration number N^ax = 500, i.e., 
we go at most 500 steps of iteration for each trial vector 
before performing a subspace rotation. 

In the above calculations, we use a 3-dimensional iter- 
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FIG. 1: Convergence rate of the modified conjugated gradient 
method in comparing with other algorithms. 



ation subspace for the modified CG algorithm, i.e., the 
subspace is constitute of the current trial vector, its cor- 
responding gradient, as well as one previous trial vector. 
In such case, each iteration step needs to calculate one 
gradient, and some combinations of the three vectors, 
as well as solving a 3-dimensional eigenvalue problem. 
From the above argument, when the iteration subspace 
is 3-dimensional, the numeric cost of each iteration step 
is almost the same as that of the original CG method. 
For the block Lanczos algorithm, however, to ensure a 
reasonable convergence rate, the iteration subspace is 50 
dimensional, i.e., one needs to calculate 50 gradients for 
each iteration step. To our experience, on step of Lanc- 
zos iteration needs longer CPU time than 50 steps of the 
modified CG method. Thus, one Lanczos step is counted 
as 50 steps in Figure 1. 

In the 3-dimensional iteration subspace spanned by 
{\Gn), l^n), the gradient vector \Gn), together 

with the previous trial vector play the same role 

as that of the conjugate gradient in the minimization of a 
quadratic function. This is especially the case when the 
Rayleigh quotient closes to the minimum point, i.e., it is 
approximately a quadratic function of the iteration trial 
vector. In fact, without the previous trial vector |(/)„_i), 
the lowest eigenvector obtained in the 2-dimensional sub- 
space spanned by{|G'„), \(j)n)} is just the result of steep- 
est descent method. By including one previous trial vec- 
tor which contains information about previous gradients, 
one is able to prevent reintroduction of errors to the re- 
fined trial vector in the direction of previous gradients. 
This is the reason we call this method as modified CG 
algorithm. 

On the other hand, in the sense of relaxation al- 
gorithm for finding lowest eigenvector 0, 0|, the re- 
fined trial vector |0„) at step n, is an approxi- 
mation to the lowest eigenvector of the matrix in 
the subspace spanned by {\(j)o), |Gi), \G2), ••■ , |G„)}, 
which is equivalent to the subspace spanned by 
{|</)o), 102), ■•• , \(pn-i), \(pn)}- According to the 

relaxation algorithm, to find the lowest eigenvector in 



4 



the subspace spanned by the above basis vectors, one 
starts from an initial trial vector \ipo), and minimizes 
the Rayleigh quotient iteratively. Each iterative step is 
to minimize the Rayleigh quotient in a two dimensional 
subspace spanned by the (updated) trial vector, and one 
basis vector. The basis vector can be chosen consecu- 
tively from the first one to the last one. After going 
through all basis vectors, one continues the next round 
of iteration by choosing the first basis vector as next ba- 
sis vector. This iteration will converge after goes through 
all basis vectors several rounds. Note that, if one starts 
with the first basis vector |0o) &s initial trial vector, in 
the two dimensional subspace spanned by two consecu- 
tive basis vector and the second basis vec- 
tor 1 minimizes the Rayleigh quotient. After going 
through all basis vector for one round, the refined trial 
vector is | which represents an approximate lowest 
eigenvector in the above subspace. 

The above two factors explain the rapid convergence of 
the modified CG algorithm. One consequence from the 
above arguments is that, if wc increase the dimension of 
the iteration subspace by including more previous trial 
vectors, the convergence rate will not increase too much. 
In other words, one needs only do the modified CG algo- 
rithm in a small iteration space. To our experience, one 
needs at most 5-dimensional iteration subspace. In most 
cases, it is enough to do the iteration in the 3-dimensional 
iteration subspace. Figure 2 shows our numeric result 
to confirm this property of the modified CG algorithm. 
Here we do the same calculation using different iteration 
subspace. The filled circle connected line is the same as 
figure 1 with 3-dimensional iteration subspace, and the 
filled square and triangle are results for 6 dimensional and 
12 dimensional iteration subspace respectively. There is 
almost no difference within 50 steps where the the con- 
vergence rate is about 10~^. One needs almost the same 
iteration steps to arrive the final precision. However, the 
3-dimensional iteration runs faster for each iteration step 
since it involves less combination and production of the 
basis vectors that span the iteration subspace. 
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FIG. 2: Convergence rates of the modified CG algorithm for 
difi'erent dimensions of the iteration subspace. 



For some matrices or some properly chosen initial 
trial vectors, the Rayleigh quotient are approximately 
quadratic functions of the trial vectors. In such cases, 
the modified CG algorithm converges in almost the same 
rate as the original CG algorithm. And a trial vector 
(f)n at step n, is an almost exact minimum in the sub- 
space spanned by {|0o), I"/"!), ■ • • , l^n), IGn)}. We have 
encountered such cases in our numeric tests. In fact, near 
a minimum, any function behaves like a quadratic func- 
tion. Some matrices with special structures also make 
the Rayleigh quotient like a quadratic function in a quite 
large region of the vector space. For such matrices, the 
CG method is indeed a very efficient method. Of cause, 
in any cases the modified CG method always outperforms 
the original CG method. 

The refined trial vector becomes closer and closer 
to the previous step's trial vector when iteration 

closes to final solution. In higher dimensional iteration, 
one may encounter (numerical) degeneracy of basis vec- 
tors that span the iteration subspace. This problem is 
easy to solve. One simple solution is to replace this step 
by an steepest descent's step. Other more sophisticated 
way is to choose some independent vectors from the ba- 
sis vectors and do this step in a small subspace. Both 
methods are easy to implement. In fact, one can detect 
the degeneracy when solving the general form eigenvalue 
problem |0J which can be conveniently solved by the con- 
ventional Choleski-Householder procedure 0| . If there 
is a degeneracy, the Choleski decomposition of the over- 
lap matrix S returns an error code. When this happens, 
one can simply redo this step with a steepest descent step. 
Alternatively, one can use a more sophisticated Choleski 
decomposition program that automatically chooses inde- 
pendent basis vector. In doing so, one must adjust the 
the matrix element of Ti. simultaneously. This two meth- 
ods need almost the same numerical cost. Of course, 
the first method is easy to implement. In our numerical 
tests, there is almost no degeneracy in the 3-dimensional 
iteration subspace. 

It is straightforward to implement preconditioning 
treatment for the modified CG algorithm. Precondition- 
ing treatment can significantly improve the convergence 
rate for matrices with large difference between lowest 
and highest eigenvalues. Due to the fact that there is 
no need to construct explicitly the conjugated gradient 
in the modified CG algorithm, it is easier to implement 
the preconditioning treatment by direct modifying each 
step's gradient. Since preconditioning treatment depends 
on specific system, we don't go into more details about 
such topic. 

The modified CG algorithm shares a common feature 
with many other iterative methods of diagonalising ma- 
trices, such as Lanczos, Dividson, RMM-DIIS, and re- 
laxation method. In all these algorithms, one refines the 
trial vector in iterative subspaces. What makes the mod- 
ified CG algorithm different from other algorithms is that 
the iteration subspaces are spanned by the trial vectors of 
previous iteration steps, as well as the latest trial vector 
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and its gradient. The trial vectors of previous steps are 
already prepared, one needs only calculating one gradi- 
ent vector (and possibly does some preconditioning treat- 
ment) to construct the basis vectors of the iterative sub- 
space. Only two basis vectors of the iterative subspace 
are different from previous one, it needs only update two 
columns of the matrix elements in the iteration subspace. 
By including previous trial vectors into the iterative sub- 
space, one avoids reintroduces errors to the trial vectors 
in the previous directions of gradients. These properties 
of the iterative subspace make the modified CG algo- 
rithm numeric efficient. And the common feature of the 
algorithm makes it easy to implement. 

It is easy to formulate block algorithm for the modified 
CG algorithm to find several lowest eigenvectors simulta- 
neously. For this end, one refines several trial vectors at 
each iteration step. Here the iteration subspace includes 
all current trial vectors, their gradients, and all trial vec- 
tors of some previous steps. In this implementation, one 
needs to find several eigenvectors by solving the general 
form eigenvalue problem Q. Trial vectors obtained in 
this way are automatically orthogonal with each other, 
and one needs no additional subspace rotation. 

However, one step of block algorithm usually needs 
more floating point operations than sequentially process- 
ing each trial vector and maintaining orthogonality be- 
tween trial vectors by Schmidt orthogonalization method. 
This is mainly because the block algorithm needs more 
flops to form the matrix elements of TL and the cor- 
responding overlap matrix S. If one needs no lowest 
eigen-solutions for TV dimensional matrix, the block al- 
gorithm's iterative subspace is M = mriQ dimensional 
with m = 3, 4, • • • . Each step of block algorithm needs 
the following floating point operations: (a) uqNL flops 
for no matrix multiplying vector operations to obtain hq 
gradients, where L < iV is the band width of the ma- 
trix; (b) 2(mno)^-/V flops for the formation of the matrix 
elements of Ti. in the iterative subspace and the corre- 
sponding overlap matrix S; (c) An O ((mno)'^) floating 
point operations for solving the general form eigen- value 
problem (d) 2mn^N flops for combination the totiq 
basis vectors to form tiq refined trial vectors. Here, the 
flops in step (c) is negligible when no << N . The to- 
tal flops of one step block algorithm is a{'m,no, N) 
noNL + 2{mno)'^N + 2mnlN. If no = 1, the above float- 
ing point operations a{m, 1, N) = NL + 2m?N + 2mN 
is the flops for processing one trial vector in sequential 
algorithm. One the other hand, sequentially processing 
each trial vector one round needs noa{m, 1, N) + An^N 
flops. Here the second term is the flops to maintain the 
orthogonality of trial vectors, including making gradi- 
ents orthogonal to previous trial vectors. Even including 
subspace rotation which is performed after some rounds 
of sequential steps, the sequential implementation needs 
less floating point operations than the block algorithm. 

If no is small, e.g., uq < 10, the difference of flops be- 
tween block and sequential algorithm is small. The block 
algorithm may be one choice in such cases. Like the block 



Lanczoz [TJ , and block Dividson |g| , there are some other 
ways to form the iterative subspace to implement the 
block version of modified CG algorithm. For example, 
the iterative subspace may contain only one gradient, 
plus all the current trial vectors and some previous trial 
vectors. The choice of iterative subspace affects the con- 
vergence properties which needs further investigations. 
For large no, e.g., no > 100, to our experiences, block 
algorithm need more numeric cost and is less efficient 
as compared with the above sequential implementation. 
The dimension of the iteration subspace grows quickly 
with the number of needed eigenvectors, and one needs 
more memory to store the basis vectors and much more 
CPU time to solve the general from eigenvalue problem 
Q which increases drastically with the dimension of the 
iterative subspace. Since lowest eigenvector usually con- 
verges faster that higher ones, the number of iteration 
steps in a block algorithm is determined by the the vec- 
tor with slowest convergence rate. 



IV. CONCLUSIONS 

In summary, in the sense of conjugated gradient al- 
gorithm, we formulate an iterative method to find a set 
of lowest eigenvalues and eigenvectors of a matrix. This 
method minimizes the Rayleigh quotient of a trial vector 
via the gradient of the Rayleigh quotient, and at the same 
time, prevents reintroduce errors in the direction of pre- 
vious gradients. We realize such idea by refining the trial 
vectors in a special kind of iteration subspaces. Each it- 
eration subspace is spanned by the latest trial vector and 
the gradient of its Rayleigh quotient, as well as some trial 
vectors of previous steps. Each iteration step is to find 
lowest eigenvector in the iteration subspace. The gradi- 
ent, together with the previous trial vector, play the role 
of the conventional conjugated gradient. In our numeri- 
cal test, it is usual enough to include only one previous 
trial vector, i.e., one needs only refining the trail vector 
in a 3-dimensional subspace. As compared to the conven- 
tional conjugated gradient algorithm, which is designed 
to minimizes a general function, the current method ex- 
ploits special properties of eigenvalue problems, and thus 
converges much faster in many cases. During iterations, 
the trial vector at the step n, is an approximately lowest 
eigenvector in the subspace spanned by the initial trial 
vector and n subsequent gradient vectors. This is the 
reason of rapid convergence rate. The easy implementa- 
tion of this algorithm makes it suitable for first principle 
calculations. 
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